Amazon S3 Glacier
Amazon S3 Glacier is a secure, durable, and low-cost storage service for data archiving and long-term backup. It is designed for data that is infrequently accessed and retained for months, years, or decades. With S3 Glacier, customers can reliably store large or small amounts of data at a very low cost.
Key Features
- Low-Cost Storage: S3 Glacier offers very low-cost storage, making it ideal for data that is infrequently accessed.
- Durability: S3 Glacier is designed for 99.999999999% (11 nines) durability, with data automatically replicated across multiple Availability Zones.
- Data Retrieval Options: S3 Glacier offers three retrieval options—Expedited, Standard, and Bulk—to balance cost and retrieval speed.
- Security: S3 Glacier supports encryption of data at rest and in transit, access control using IAM policies, and audit logs via AWS CloudTrail.
- Flexible Management: S3 Glacier integrates with S3 Lifecycle policies to automatically transition objects from S3 to S3 Glacier based on predefined rules.
- Vault Lock: The Vault Lock feature allows you to easily deploy and enforce compliance controls that meet even the most stringent regulatory requirements.
- Integration with AWS Services: S3 Glacier integrates with various AWS services such as AWS Lambda, AWS DataSync, and AWS Storage Gateway for seamless data management and processing.
Common Use Cases
- Data Archiving: Ideal for archiving data that is infrequently accessed, such as compliance records, raw data for analytics, and backups.
- Long-Term Backup: S3 Glacier is used for long-term backup of mission-critical data, offering durable storage at a fraction of the cost of traditional on-premises solutions.
- Compliance Storage: Organizations use S3 Glacier for storing compliance data, with features like Vault Lock providing added security and governance controls.
- Digital Preservation: Suitable for the long-term preservation of digital media, such as photos, videos, and other important files that need to be retained indefinitely.
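To make the compliance use case more concrete, here is a minimal sketch of what a Vault Lock policy could look like. It builds a policy document that denies archive deletion until each archive reaches a minimum age. The function name `build_vault_lock_policy`, the vault ARN, and the 365-day retention period are hypothetical values chosen for illustration, not recommendations.

```python
import json


def build_vault_lock_policy(vault_arn: str, min_age_days: int) -> str:
    """Return a Vault Lock policy (JSON string) that denies archive deletion
    until each archive is at least `min_age_days` old (a WORM-style rule)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention-ends",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(min_age_days)
                    }
                },
            }
        ],
    }
    return json.dumps(policy)


# Hypothetical vault ARN, for illustration only.
EXAMPLE_VAULT_ARN = "arn:aws:glacier:us-east-1:123456789012:vaults/example-vault"
policy_json = build_vault_lock_policy(EXAMPLE_VAULT_ARN, 365)
```

With boto3, a string like this would typically be passed to the Glacier client's `initiate_vault_lock` call as `policy={"Policy": policy_json}`, and the returned lock ID would then be confirmed with `complete_vault_lock`. The sketch stops short of calling AWS so that it can be run without credentials.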
Architecture Overview
The architecture of Amazon S3 Glacier is built from the following components:
- Vaults: Vaults are containers in S3 Glacier used to store archives (data). Each vault can be configured with policies to control access and retrieval settings.
- Archives: An archive is any data, such as a photo, video, or document, that you store in a vault. Each archive is assigned a unique ID within the vault.
- Data Retrieval: S3 Glacier offers three retrieval options: Expedited (1-5 minutes), Standard (3-5 hours), and Bulk (5-12 hours).
- Access Control: Access to S3 Glacier resources is managed through IAM policies, which allow you to control who can create, access, and manage your vaults and archives.
- Data Encryption: S3 Glacier encrypts data at rest by default, and you can configure it to use AWS KMS keys for additional security.
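The trade-off between the three retrieval options can be sketched as a small helper that picks the cheapest tier whose worst-case retrieval time still meets a deadline. The time windows come from the list above; the function names `pick_tier` and `build_restore_request` are hypothetical, and the restore request shape assumes an object stored in the S3 Glacier storage class and restored through the S3 API.

```python
from typing import Optional

# Upper bound of each retrieval window in minutes (from the list above),
# ordered from the cheapest tier to the most expensive one.
TIER_MAX_MINUTES = [
    ("Bulk", 12 * 60),
    ("Standard", 5 * 60),
    ("Expedited", 5),
]


def pick_tier(deadline_minutes: int) -> Optional[str]:
    """Return the cheapest tier whose worst-case time fits the deadline."""
    for tier, max_minutes in TIER_MAX_MINUTES:
        if max_minutes <= deadline_minutes:
            return tier
    return None  # no tier is guaranteed to be fast enough


def build_restore_request(tier: str, days: int) -> dict:
    """Build a RestoreRequest that keeps the restored copy for `days` days."""
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}
```

For example, `pick_tier(6 * 60)` selects Standard, because Bulk may take up to 12 hours. With boto3, the resulting dictionary would be passed as the `RestoreRequest` argument of the S3 client's `restore_object` call.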
Integration with Other AWS Services
Amazon S3 Glacier integrates with various AWS services to enhance its functionality and streamline data management:
- Amazon S3: Use S3 Lifecycle policies to automatically transition data from S3 Standard or S3-IA to S3 Glacier for long-term storage.
- AWS DataSync: Automate and accelerate moving data from on-premises storage to S3 Glacier using AWS DataSync.
- AWS Lambda: Trigger serverless functions when data is retrieved from S3 Glacier, enabling automated processing or alerts.
- AWS Storage Gateway: Integrate S3 Glacier with on-premises storage through AWS Storage Gateway, facilitating seamless data archiving.
- AWS Backup: Manage and automate backups across AWS services, including S3 Glacier, using AWS Backup.
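As an illustration of the Lambda integration, the following is a minimal sketch of a handler that reacts when archived objects finish restoring. It assumes the bucket is configured to send `s3:ObjectRestore:Completed` event notifications to the function, and that each record carries the event name without the `s3:` prefix, as in the standard S3 notification format. The handler name and the return value are illustrative choices.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Collect (bucket, key) pairs for objects whose restore has completed."""
    restored = []
    for record in event.get("Records", []):
        # Skip any notification that is not a completed restore.
        if record.get("eventName") != "ObjectRestore:Completed":
            continue
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3_info["object"]["key"])
        restored.append((bucket, key))
    # Placeholder: real code would process or announce each restored object here.
    return restored
```

Returning the list keeps the sketch easy to test locally; a production function would more likely start a processing job or publish an alert for each restored object.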
Things to Remember for the Exam
- Data Retrieval Options: Understand the three retrieval options in S3 Glacier:
  - Expedited: Fastest retrieval, suitable for urgent data needs, with retrieval times of 1-5 minutes.
  - Standard: Cost-effective retrieval for regular use cases, with retrieval times of 3-5 hours.
  - Bulk: Lowest-cost retrieval option, suitable for large data sets, with retrieval times of 5-12 hours.
- Vault Lock: Know the importance of Vault Lock:
  - Enforces compliance controls on vaults, such as WORM (Write Once Read Many) policies.
  - Once a Vault Lock policy is locked, it cannot be changed or deleted, ensuring data integrity and compliance. Until the lock is completed (within 24 hours of initiating it), the policy can still be tested and aborted.
- Data Encryption: Be familiar with encryption in S3 Glacier:
  - Data is encrypted at rest by default using AES-256 encryption.
  - Optionally, use AWS KMS to manage encryption keys for additional security.
- S3 Lifecycle Integration: Understand how S3 Glacier integrates with S3 Lifecycle policies:
  - Automatically transition objects from S3 Standard, S3-IA, or S3 One Zone-IA to S3 Glacier based on lifecycle rules.
  - Reduces storage costs by moving infrequently accessed data to lower-cost Glacier storage.
- Use Cases: Be aware of common use cases for S3 Glacier:
  - Archiving compliance records, raw data, and backups for long-term storage.
  - Digital preservation of important files like photos, videos, and documents.
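The S3 Lifecycle integration mentioned above can be sketched as a small configuration builder. It produces a single rule that transitions objects under a key prefix to the S3 Glacier storage class after a given number of days. The function name `build_glacier_lifecycle`, the `logs/` prefix, and the 90-day threshold are hypothetical examples.

```python
def build_glacier_lifecycle(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to the S3 Glacier storage class `days` days after creation."""
    return {
        "Rules": [
            {
                # Readable rule ID derived from the prefix and threshold.
                "ID": f"archive-{prefix.strip('/') or 'all'}-after-{days}d",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


lifecycle = build_glacier_lifecycle("logs/", 90)
```

With boto3, this dictionary would be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` call. Note that applying it replaces any lifecycle configuration already on the bucket, so existing rules would need to be included in the same request.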